Appl. No. 10/019,882 

Amdt. Dated December 15, 2008 

Reply to Final Office Action of September 26, 2008 



Amendments to the Claims: 

This listing of claims will replace all prior versions, and listings, of claims in the application: 

Listing of Claims: 

1. (canceled) A method comprising: 

(a) calculating estimated weights for identified errors in recognition of utterances based 
on a reference string; 

(b) marking sections of the utterances as being misrecognized and associating the 
estimated weights with the sections of the utterances; and 

(c) using the weighted sections of the utterances to convert a speaker independent model 
to a speaker dependent model. 

2. (currently amended) The method of claim [[1]] 4, wherein parts (a) - (c) are 
repeated at least once. 

3. (currently amended) The method of claim [[1]] 4, wherein the utterances are 
converted into a recognized phone string a first time through applying the speaker independent 
model and thereafter through applying the most recently obtained speaker dependent model. 

4. (previously presented) A method comprising: 

(a) calculating estimated weights for identified errors in recognition of utterances based 
on a reference string; 

(b) marking sections of the utterances as being misrecognized and associating the 
estimated weights with the sections of the utterances; and 

(c) using the weighted sections of the utterances to convert a speaker independent model 
to a speaker dependent model; 

wherein calculating the estimated weights comprises computing an average likelihood 
difference per frame and then computing a weight value by averaging the average likelihood 
difference over error words. 

5. (canceled) The method of claim 1, wherein calculating the estimated weights 
comprises computing an average likelihood difference per frame according to equation (1) as 
follows: 

L_n = \frac{H_l^n}{H_e^n - H_b^n} - \frac{R_l^n}{R_e^n - R_b^n}    (1), 

where H_l^n is a log likelihood of hypothesis word n, H_b^n is a beginning frame index (in time), 
and H_e^n is an end frame index, and R_l^n, R_b^n and R_e^n are counterparts for the reference string. 

6. (currently amended) [[The]] A method [[of claim 5,]] comprising: 

(a) calculating estimated weights for identified errors in recognition of utterances based 
on a reference string; 

(b) marking sections of the utterances as being misrecognized and associating the 
estimated weights with the sections of the utterances; and 

(c) using the weighted sections of the utterances to convert a speaker independent model 
to a speaker dependent model; 

wherein calculating the estimated weights [[further comprising]] comprises computing an 
average likelihood difference per frame according to equation (1) as follows: 

L_n = \frac{H_l^n}{H_e^n - H_b^n} - \frac{R_l^n}{R_e^n - R_b^n}    (1), 

where H_l^n is a log likelihood of hypothesis word n, H_b^n is a beginning frame index (in 
time), and H_e^n is an end frame index, and R_l^n, R_b^n and R_e^n are counterparts for the reference 
string, and computing a weight for misrecognized words of a particular speaker "i" according to 
equation (2) as follows: 

W_i = \frac{1}{m} \sum_{n=1}^{m} |L_n|    (2), 

wherein m is a number of misrecognized words. 
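
For illustration only, equations (1) and (2) can be sketched in Python as follows. The class and function names (WordSegment, likelihood_difference, speaker_weight) are hypothetical and do not appear in the application. Returning 0.0 when there are no misrecognized words is an assumption of this sketch, since the claims do not address that case.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class WordSegment:
    """Per-word statistics; field names are illustrative, not from the claims."""
    log_likelihood: float  # H_l^n (hypothesis) or R_l^n (reference)
    begin_frame: int       # H_b^n or R_b^n
    end_frame: int         # H_e^n or R_e^n

    def per_frame(self) -> float:
        """Log likelihood divided by the number of frames the word spans."""
        n_frames = self.end_frame - self.begin_frame
        if n_frames <= 0:
            raise ValueError("segment must span at least one frame")
        return self.log_likelihood / n_frames


def likelihood_difference(hyp: WordSegment, ref: WordSegment) -> float:
    """Equation (1): per-frame hypothesis likelihood minus per-frame reference likelihood."""
    return hyp.per_frame() - ref.per_frame()


def speaker_weight(pairs: Sequence[Tuple[WordSegment, WordSegment]]) -> float:
    """Equation (2): mean of |L_n| over the m misrecognized words of one speaker."""
    if not pairs:
        return 0.0  # assumption: no error words means no weight
    total = sum(abs(likelihood_difference(h, r)) for h, r in pairs)
    return total / len(pairs)


# Two misrecognized words, each given as a (hypothesis, reference) pair.
example_pairs = [
    (WordSegment(-120.0, 10, 30), WordSegment(-100.0, 10, 30)),  # L_1 = -6 - (-5) = -1
    (WordSegment(-45.0, 30, 45), WordSegment(-60.0, 30, 50)),    # L_2 = -3 - (-3) = 0
]
example_weight = speaker_weight(example_pairs)  # (|-1| + |0|) / 2
```

In this sketch the absolute value in equation (2) means that an error word contributes to the weight whether the hypothesis scored above or below the reference on a per-frame basis.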

7. (currently amended) The method of claim [[1]] 4, wherein for a particular 
speaker, different misrecognized words have different weights. 

8. (canceled) A method comprising: 

(a) recognizing utterances through converting the utterances into a recognized string; 

(b) comparing the recognized string with a reference string to determine errors; 

(c) calculating estimated weights for sections of the utterances; 

(d) marking the errors in the utterances and providing corresponding estimated weights to 
form adaptation enrollment data; and 

(e) using the adaptation enrollment data to convert a speaker independent model to a 
speaker dependent model. 

9. (currently amended) The method of claim [[8]] 12, wherein the utterances are 
converted into the recognized string through applying the speaker independent model. 

10. (currently amended) The method of claim [[8]] 12, wherein parts (b) - (e) are 
repeated until differences between the reference and recognized strings are less than a threshold. 

11. (currently amended) The method of claim [[8]] 12, wherein the utterances are 
converted into a recognized string a first time through applying the speaker independent model 
and thereafter through applying the most recently obtained speaker dependent model. 
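
For illustration only, the repetition described in claims 10 and 11 can be sketched as a loop. Every name below (adapt_until_converged, recognize, count_errors, adapt, max_rounds) is hypothetical. The recognizer, the error counter, and the adaptation step are passed in as callables because the claims do not specify them. Adapting the most recent model in each round, rather than the original speaker independent model, and capping the number of rounds are assumptions of this sketch.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Model = TypeVar("Model")


def adapt_until_converged(
    si_model: Model,
    utterances: Sequence[object],
    reference: List[str],
    recognize: Callable[[Model, Sequence[object]], List[str]],
    count_errors: Callable[[List[str], List[str]], int],
    adapt: Callable[[Model, Sequence[object], List[str], List[str]], Model],
    threshold: int,
    max_rounds: int = 10,
) -> Tuple[Model, int]:
    """Repeat recognize/compare/adapt until the error count drops below threshold.

    Returns the final model and the number of adaptation rounds performed.
    """
    model = si_model  # first pass applies the speaker independent model
    for round_index in range(max_rounds):
        recognized = recognize(model, utterances)
        if count_errors(recognized, reference) < threshold:
            return model, round_index
        # later passes apply the most recently obtained speaker dependent model
        model = adapt(model, utterances, recognized, reference)
    return model, max_rounds
```

The max_rounds cap is a practical safeguard added here so the loop terminates even if the error count never falls below the threshold.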

12. (previously presented) A method comprising: 

(a) recognizing utterances through converting the utterances into a recognized string; 

(b) comparing the recognized string with a reference string to determine errors; 

(c) calculating estimated weights for sections of the utterances; 

(d) marking the errors in the utterances and providing corresponding estimated weights to 
form adaptation enrollment data; and 

(e) using the adaptation enrollment data to convert a speaker independent model to a 
speaker dependent model; 
wherein calculating the estimated weights comprises computing an average likelihood 
difference per frame and then computing a weight value by averaging the average likelihood 
difference over all error words. 

13. (canceled) The method of claim 8, wherein calculating the estimated weights 
comprises calculating an average likelihood difference per frame according to equation (1) as 
follows: 

L_n = \frac{H_l^n}{H_e^n - H_b^n} - \frac{R_l^n}{R_e^n - R_b^n}    (1), 

where H_l^n is a log likelihood of hypothesis word n, H_b^n is a beginning frame index (in time), 
and H_e^n is an end frame index, and R_l^n, R_b^n and R_e^n are counterparts for the reference string. 

14. (currently amended) [[The]] A method [[of claim 13,]] comprising: 

(a) recognizing utterances through converting the utterances into a recognized string; 

(b) comparing the recognized string with a reference string to determine errors; 

(c) calculating estimated weights for sections of the utterances; 

(d) marking the errors in the utterances and providing corresponding estimated weights to 
form adaptation enrollment data; and 

(e) using the adaptation enrollment data to convert a speaker independent model to a 
speaker dependent model; 

wherein calculating the estimated weights comprises calculating an average likelihood 
difference per frame according to equation (1) as follows: 

L_n = \frac{H_l^n}{H_e^n - H_b^n} - \frac{R_l^n}{R_e^n - R_b^n}    (1), 

where H_l^n is a log likelihood of hypothesis word n, H_b^n is a beginning frame index (in 
time), and H_e^n is an end frame index, and R_l^n, R_b^n and R_e^n are counterparts for the reference 
string, and calculating a weight for misrecognized words of a particular speaker "i" [[is calculated]] 
according to equation (2) as follows: 

W_i = \frac{1}{m} \sum_{n=1}^{m} |L_n|    (2), 

wherein m is a number of misrecognized words. 

15. (currently amended) The method of claim [[8]] 12, wherein for a particular 
speaker, different misrecognized words have different weights. 

16. (canceled) An article of manufacture comprising: 

a computer-readable storage medium having executable instructions thereon which when 
executed cause a processor to perform operations comprising: 

(a) calculating estimated weights for identified errors in recognition of utterances based 
on a reference string; 

(b) marking sections of the utterances as being misrecognized and associating the 
estimated weights with the sections of the utterances; and 

(c) using the weighted sections of the utterances to convert a speaker independent model 
to a speaker dependent model. 

17. (currently amended) The article of manufacture of claim [[16]] 19, wherein parts (a) 
- (c) are repeated at least once. 

18. (currently amended) The article of manufacture of claim [[16]] 19, wherein the 
utterances are converted into a recognized phone string a first time through applying the speaker 
independent model and thereafter through applying the most recently obtained speaker 
dependent model. 

19. (previously presented) An article of manufacture comprising: 

a computer-readable storage medium having executable instructions thereon which when 
executed cause a processor to perform operations comprising: 

(a) calculating estimated weights for identified errors in recognition of utterances based 
on a reference string; 
(b) marking sections of the utterances as being misrecognized and associating the 
estimated weights with the sections of the utterances; and 

(c) using the weighted sections of the utterances to convert a speaker independent model 
to a speaker dependent model; 

wherein the estimated weights are computed through computing an average likelihood 
difference per frame and then computing a weight value by averaging the average likelihood 
difference over error words. 

20. (canceled) The article of manufacture of claim 16, wherein an average likelihood 
difference per frame is used to calculate the estimated weights and is computed according to 
equation (1) as follows: 

L_n = \frac{H_l^n}{H_e^n - H_b^n} - \frac{R_l^n}{R_e^n - R_b^n}    (1), 

where H_l^n is a log likelihood of hypothesis word n, H_b^n is a beginning frame index (in time), 
and H_e^n is an end frame index, and R_l^n, R_b^n and R_e^n are counterparts for the reference string. 

21. (currently amended) [[The]] An article of manufacture [[of claim 20,]] comprising: 

a computer-readable storage medium having executable instructions thereon which when 
executed cause a processor to perform operations comprising: 

(a) calculating estimated weights for identified errors in recognition of utterances based 
on a reference string; 

(b) marking sections of the utterances as being misrecognized and associating the 
estimated weights with the sections of the utterances; and 

(c) using the weighted sections of the utterances to convert a speaker independent model 
to a speaker dependent model; 

wherein an average likelihood difference per frame is used to calculate the estimated 
weights and is computed according to equation (1) as follows: 

L_n = \frac{H_l^n}{H_e^n - H_b^n} - \frac{R_l^n}{R_e^n - R_b^n}    (1), 

where H_l^n is a log likelihood of hypothesis word n, H_b^n is a beginning frame index (in 
time), and H_e^n is an end frame index, and R_l^n, R_b^n and R_e^n are counterparts for the reference 
string, and 

a weight for misrecognized words of a particular speaker "i" is calculated according to 
equation (2) as follows: 

W_i = \frac{1}{m} \sum_{n=1}^{m} |L_n|    (2), 

wherein m is a number of misrecognized words. 

22. (currently amended) The article of manufacture of claim [[16]] 19, wherein for a 
particular speaker, different misrecognized words have different weights. 

23. (canceled) An article of manufacture comprising: 

a computer-readable storage medium having executable instructions thereon which when 
executed cause a processor to perform operations comprising: 

(a) recognizing utterances through converting the utterances into a recognized phone 
string; 

(b) comparing the recognized string with a reference string to determine errors; 

(c) calculating estimated weights for sections of the utterances; 

(d) marking the errors in the utterances and providing corresponding estimated weights to 
form adaptation enrollment data; and 

(e) using the adaptation enrollment data to convert a speaker independent model to a 
speaker dependent model. 

24. (currently amended) The article of manufacture of claim [[23]] 27, wherein the 
utterances are converted into the recognized string through applying the speaker independent 
model. 

25. (currently amended) The article of manufacture of claim [[23]] 27, wherein parts (b) 
- (e) are repeated until differences between the reference and recognized strings are less than a 
threshold. 

26. (currently amended) The article of manufacture of claim [[23]] 27, wherein the 
utterances are converted into a recognized string a first time through applying the speaker 
independent model and thereafter through applying the most recently obtained speaker 
dependent model. 

27. (previously presented) An article of manufacture comprising: 

a computer-readable storage medium having executable instructions thereon which when 
executed cause a processor to perform operations comprising: 

(a) recognizing utterances through converting the utterances into a recognized phone 
string; 

(b) comparing the recognized string with a reference string to determine errors; 

(c) calculating estimated weights for sections of the utterances; 

(d) marking the errors in the utterances and providing corresponding estimated weights to 
form adaptation enrollment data; and 

(e) using the adaptation enrollment data to convert a speaker independent model to a 
speaker dependent model; 

wherein the estimated weights are computed through computing an average likelihood 
difference per frame and then computing a weight value by averaging the average likelihood 
difference over error words. 

28. (canceled) The article of manufacture of claim 23, wherein an average likelihood 
difference per frame is used to calculate the estimated weights and is calculated according to the 
equation (1) as follows: 

L_n = \frac{H_l^n}{H_e^n - H_b^n} - \frac{R_l^n}{R_e^n - R_b^n}    (1), 

where H_l^n is a log likelihood of hypothesis word n, H_b^n is a beginning frame index (in time), 
and H_e^n is an end frame index, and R_l^n, R_b^n and R_e^n are counterparts for the reference string. 

29. (currently amended) [[The]] An article of manufacture [[of claim 28,]] comprising: 

a computer-readable storage medium having executable instructions thereon which when 
executed cause a processor to perform operations comprising: 

(a) recognizing utterances through converting the utterances into a recognized phone 
string; 

(b) comparing the recognized string with a reference string to determine errors; 

(c) calculating estimated weights for sections of the utterances; 

(d) marking the errors in the utterances and providing corresponding estimated weights to 
form adaptation enrollment data; and 

(e) using the adaptation enrollment data to convert a speaker independent model to a 
speaker dependent model; 

wherein an average likelihood difference per frame is used to calculate the estimated 
weights and is calculated according to the equation (1) as follows: 

L_n = \frac{H_l^n}{H_e^n - H_b^n} - \frac{R_l^n}{R_e^n - R_b^n}    (1), 

where H_l^n is a log likelihood of hypothesis word n, H_b^n is a beginning frame index (in 
time), and H_e^n is an end frame index, and R_l^n, R_b^n and R_e^n are counterparts for the reference 
string, and 

a weight for misrecognized words of a particular speaker "i" is calculated according to 
equation (2) as follows: 

W_i = \frac{1}{m} \sum_{n=1}^{m} |L_n|    (2), 

wherein m is a number of misrecognized words. 

30. (currently amended) The article of manufacture of claim [[23]] 27, wherein for a 
particular speaker, different misrecognized words have different weights. 

31. (new) The method of claim 4 wherein calculating the estimated weights further 
comprises: 

running a force alignment program on the reference string to obtain statistics of 
references; 

decoding the utterances to obtain statistics of 1-best hypothesis; and 

aligning the 1-best hypothesis with the reference string to obtain the error words. 
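
For illustration only, the final step of claim 31 (aligning the 1-best hypothesis with the reference string to obtain the error words) can be sketched with a word-level alignment. This sketch substitutes Python's standard difflib.SequenceMatcher for whatever aligner the applicant uses, and it does not perform forced alignment or decoding. The function name find_error_words and the sample word lists are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

ErrorRegion = Tuple[str, List[str], List[str]]


def find_error_words(reference: List[str], hypothesis: List[str]) -> List[ErrorRegion]:
    """Return (tag, reference_words, hypothesis_words) for each mismatched region.

    tag is one of "replace", "delete" or "insert", as reported by difflib.
    """
    matcher = SequenceMatcher(a=reference, b=hypothesis, autojunk=False)
    errors: List[ErrorRegion] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            errors.append((tag, reference[i1:i2], hypothesis[j1:j2]))
    return errors


reference_words = "call john smith at home".split()
hypothesis_words = "call joan smith home".split()
error_regions = find_error_words(reference_words, hypothesis_words)
# error_regions == [("replace", ["john"], ["joan"]), ("delete", ["at"], [])]
```

Each returned region identifies a section of the utterance that could then be marked as misrecognized and paired with its estimated weight.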

32. (new) The method of claim 6 wherein calculating the estimated weights further 
comprises: 

running a force alignment program on the reference string to obtain statistics of 
references; 

decoding the utterances to obtain statistics of 1-best hypothesis; and 

aligning the 1-best hypothesis with the reference string to obtain the error words. 

33. (new) The method of claim 1 wherein calculating the estimated weights further 
comprises: 

running a force alignment program on the reference string to obtain statistics of 
references; 

decoding the utterances to obtain statistics of 1-best hypothesis; and 

aligning the 1-best hypothesis with the reference string to obtain the error words. 

34. (new) The article of manufacture of claim 19 wherein the executable instructions 
causing the processor to perform calculating estimated weights comprises executable instructions 
thereon which when executed cause the processor to perform operations comprising: 

running a force alignment program on the reference string to obtain statistics of 
references; 

decoding the utterances to obtain statistics of 1-best hypothesis; and 

aligning the 1-best hypothesis with the reference string to obtain the error words. 

35. (new) The article of manufacture of claim 23 wherein the executable instructions 
causing the processor to perform calculating estimated weights comprises executable instructions 
thereon which when executed cause the processor to perform operations comprising: 

running a force alignment program on the reference string to obtain statistics of 
references; 

decoding the utterances to obtain statistics of 1-best hypothesis; and 

aligning the 1-best hypothesis with the reference string to obtain the error words. 

36. (new) The article of manufacture of claim 27 wherein the executable instructions 
causing the processor to perform calculating estimated weights comprises executable instructions 
thereon which when executed cause the processor to perform operations comprising: 

running a force alignment program on the reference string to obtain statistics of 
references; 

decoding the utterances to obtain statistics of 1-best hypothesis; and 

aligning the 1-best hypothesis with the reference string to obtain the error words. 

37. (new) The article of manufacture of claim 29 wherein the executable instructions 
causing the processor to perform calculating estimated weights comprises executable instructions 
thereon which when executed cause the processor to perform operations comprising: 

running a force alignment program on the reference string to obtain statistics of 
references; 

decoding the utterances to obtain statistics of 1-best hypothesis; and 

aligning the 1-best hypothesis with the reference string to obtain the error words. 